1.  INTRODUCTION:

Estimating the number, amplitudes and frequencies of complex sinusoids in a signal is an important problem. The model can be expressed as follows. Let $\{y_1, \ldots, y_n\}$ be a sample of size n, where $y_k$ can be written as

$$ y_k = \sum_{j=1}^{M} A_j e^{i \omega_j k} + e_k, \quad \text{for } k = 1, \ldots, n. \tag{1} $$

The amplitudes, $A_j$'s, are unknown complex numbers and the frequencies, $\omega_j$'s, are unknown radian frequencies between 0 and $2\pi$. The additive errors, $e_k$'s, are complex valued Gaussian random variables; they are independent and identically distributed (i.i.d.) with zero mean. The real and imaginary parts of the $e_k$'s are assumed to be independent. The number of signals, M, is unknown. The problem is to estimate M, $A_1, \ldots, A_M$, and $\omega_1, \ldots, \omega_M$.
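As a concrete illustration of model (1), the following minimal sketch simulates one sample. The function name `simulate_sinusoids`, the unit amplitudes, the seed and the noise level `sigma2 = 0.1` are illustrative assumptions, not values taken from the paper. The noise variance is split as $\sigma^2/2$ between the real and imaginary parts, following the convention stated for the numerical experiments in Section 5.

```python
import numpy as np

def simulate_sinusoids(amplitudes, omegas, n, sigma2, rng):
    """Draw y_1, ..., y_n from model (1) with complex Gaussian errors."""
    amplitudes = np.asarray(amplitudes, dtype=complex)
    omegas = np.asarray(omegas, dtype=float)
    k = np.arange(1, n + 1)                          # time index k = 1, ..., n
    signal = np.exp(1j * np.outer(k, omegas)) @ amplitudes
    # Real and imaginary parts are independent, each with variance sigma2 / 2.
    noise = np.sqrt(sigma2 / 2) * (rng.standard_normal(n) + 1j * rng.standard_normal(n))
    return signal + noise

rng = np.random.default_rng(0)
omegas = 2 * np.pi * np.array([0.50, 0.60])          # two radian frequencies (assumed)
y = simulate_sinusoids([1.0, 1.0], omegas, n=25, sigma2=0.1, rng=rng)
y_clean = simulate_sinusoids([1.0, 1.0], omegas, n=25, sigma2=0.0, rng=rng)
```

Setting `sigma2=0.0` returns the noiseless signal, which is convenient for checking an estimator before noise is added.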

In the last twenty years several iterative and non-iterative procedures have been developed to estimate the parameters of an exponential model very efficiently, but much less attention has been paid to estimating the number of signals. See, for example, Stoica (1993) for an extensive list of references up to that point, and Kundu and Mitra (1995, 1999) for some more recent references. In this paper we concentrate mainly on estimating M. Tufts and Kumaresan (1982) proposed some graphical techniques to estimate M, but these are very subjective in nature. Some other techniques, for example those of Bartlett (1954) and Lawley (1956), can also be used, but they too depend on subjective choices by the user, which makes practical implementation difficult.

Rao (1988) proposed Information Theoretic Criteria (ITC), following the approach of Zhao, Krishnaiah and Bai (1986), for estimating the number of signals in the undamped exponential model (1). He did not provide any numerical results on the performance of his procedure. It has been observed (Kundu; 1992) that Rao's suggestion is not easy to implement in practice, and a practical implementation procedure was suggested by Kundu (1992). It has also been observed that the behaviour of the different ITC depends very much on the penalty function used. Some suggestions about the penalty function, based on extensive computer simulations, were given in Kundu (1992). It is not very difficult to show (as correctly mentioned by Rao; 1988) that the ITC proposed by him give consistent estimates of M in the case of the undamped exponential model (1). Bai et al. (1987) also proposed a method known as the Equivariance Linear Prediction (EVLP) method for estimating M in an undamped exponential model, and proved the strong consistency of the EVLP when the errors are not necessarily Gaussian. However, extensive numerical simulations (Kundu; 1992) suggest that the EVLP does not work well for small sample sizes, even when the errors are Gaussian.

In this paper we estimate M through a Cross Validation (CV) approach, which usually performs well for small and moderate sample sizes. Rao (1988) first mentioned that the CV technique can be used for estimating M in model (1), and he provided a CV scheme for practical implementation. However, he did not provide any numerical results on the performance of his proposed CV procedure. It is observed that Rao's CV approach for model (1) is not easy to implement in practice as it was suggested. We propose a new, simple CV procedure for the undamped exponential model, using the missing value technique discussed in Section 3. We perform detailed numerical experiments to compare the different ITC and the proposed CV procedure using different models. It is observed that the performance of the CV approach is quite similar to that of the best performing ITC, and in certain cases it works marginally better.

The rest of the paper is organized as follows. In Section 2 we give a brief description of the different ITC. The estimation of the parameters in the presence of a missing value is discussed in Section 3. The CV approach and its implementation are discussed in Section 4. The results of the numerical experiments are presented in Section 5, and finally we draw conclusions from our work in Section 6.


2.  DIFFERENT  INFORMATION  THEORETIC  CRITERIA: 

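Let $\{y_1, \ldots, y_n\}$ be a sample of size n from the model (1). Let L be a parameter that ranges over all possible numbers of signals, i.e. $L \in \{1, \ldots, K\}$, where K is some preassigned fixed number. We make the assumption that the number of signals can be at most equal to K. Then it follows that the joint density function of the observed data is given by

$$ f(y \mid \theta_L) = \left(\frac{1}{2\pi}\right)^{n} \sigma^{-2n} \exp\left\{ -\frac{1}{2\sigma^2} \sum_{k=1}^{n} \left| y_k - \mu_k(\theta_L) \right|^2 \right\}, \tag{2} $$

where

$$ \theta_L = (A_1, \ldots, A_L, \omega_1, \ldots, \omega_L) \tag{3} $$

and

$$ \mu_k(\theta_L) = \sum_{j=1}^{L} A_j e^{i \omega_j k}. \tag{4} $$

The problem can now be formulated as follows: Given the family of models

$$ \{ f(y \mid \theta_L);\ L = 1, \ldots, K \}, \tag{5} $$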

select the true one. Posed in this way, the problem becomes a model selection problem and is well suited to the different ITC, such as the Akaike Information Criterion (AIC) of Akaike (1973, 1974), the Minimum Description Length (MDL) criterion (Bayesian Information Criterion (BIC)) of Schwarz (1978) and Rissanen (1978), or the Efficient Detection Criterion (EDC) of Zhao, Krishnaiah and Bai (1986). The AIC, MDL and EDC are known as penalized likelihood methods in the statistical model selection literature. Here a penalty function is subtracted from the log-likelihood function before it is maximized. This serves to penalize, or discourage, the addition of more and more parameters.

The AIC suggests choosing $\hat{M}$, an estimator of M, which minimizes the following expression

$$ \mathrm{AIC}(L) = -\log f(y \mid \hat{\theta}_L) + d(\hat{\theta}_L), \tag{6} $$

for $L = 1, \ldots, K$, where $\hat{\theta}_L$ is the maximum likelihood estimator (MLE) of $\theta_L$ and $d(\hat{\theta}_L) = 3L + 1$ is the total number of independent parameters when the model order is L. Akaike's basic idea was to choose the model that minimizes the mean of the Kullback-Leibler distance between the true density $f(y \mid \theta_L)$ and the estimated density $f(y \mid \hat{\theta}_L)$. Since the distance is unknown, he proposed to estimate it by the log-likelihood value at the MLE. The second term in (6) was added to make the log-likelihood function at the MLE an unbiased estimator of the Kullback-Leibler distance.
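A worked step may clarify what the first term of (6) looks like in practice. It rests on two assumptions that the text does not state explicitly: that the noise variance $\sigma^2$ is unknown and maximized out together with the amplitudes and frequencies, and that the real and imaginary parts of the errors share a common variance. The symbol $\mathrm{RSS}_L$ is introduced here for illustration only.

```latex
\begin{aligned}
\mathrm{RSS}_L &= \min_{A,\,\omega} \sum_{k=1}^{n} \Bigl| y_k - \sum_{j=1}^{L} A_j e^{i\omega_j k} \Bigr|^2, \\
-\log f(y \mid \hat{\theta}_L) &= n \log \mathrm{RSS}_L + n\bigl(1 + \log(\pi/n)\bigr), \\
\mathrm{AIC}(L) &= n \log \mathrm{RSS}_L + 3L + 1 + n\bigl(1 + \log(\pi/n)\bigr).
\end{aligned}
```

The last term does not depend on L, so under these assumptions comparing AIC values across model orders amounts to comparing $n \log \mathrm{RSS}_L + 3L + 1$.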

The MDL criterion was introduced by Rissanen (1978). The basic idea is that the best model is the one that provides the shortest description of the data. It is observed (Rissanen; 1983) that for large samples this leads to the selection of the model that minimizes

$$ \mathrm{MDL}(L) = -\log f(y \mid \hat{\theta}_L) + \frac{1}{2} d(\hat{\theta}_L) \log n, \tag{7} $$

for $L = 1, \ldots, K$, where $f(y \mid \hat{\theta}_L)$ and $d(\hat{\theta}_L) = 3L + 1$ are the same as defined before.

Schwarz (1978) suggested a model selection criterion based on Bayesian arguments. Assuming a priori probabilities for all the competing models, he proposed selecting the model that maximizes the posterior probability. It was shown that for a model belonging to an exponential family, the maximization of the posterior probability leads asymptotically to the minimization of the criterion given by (7).

The Efficient Detection Criterion (EDC) of Zhao, Krishnaiah and Bai (1986) consists of choosing an estimator $\hat{M}$ of M which minimizes

$$ \mathrm{EDC}(L) = -\log f(y \mid \hat{\theta}_L) + C_n d(\hat{\theta}_L), \tag{8} $$

for $L = 1, \ldots, K$, where the $C_n$'s are such that

$$ \lim_{n \to \infty} \frac{C_n}{n} = 0, \qquad \lim_{n \to \infty} \frac{C_n}{\log \log n} = \infty \tag{9} $$

and $d(\hat{\theta}_L) = 3L + 1$ (see Rao; 1988). Observe that the MDL criterion is a special case of the EDC: for MDL, $C_n$ takes the value $\frac{1}{2}\log n$ in (8). It has been mentioned in Rao (1988) (the proof is not very difficult) that for an undamped exponential model the estimators obtained from EDC or MDL are strongly consistent estimators of M, whereas the AIC estimator is not consistent. Although any $C_n$ satisfying (9) gives a strongly consistent estimator of M, it is unfortunately observed that the small sample performance of the estimators depends very much on the choice of $C_n$ (see Kundu; 1992). Kundu (1992) used a wide variety of penalty functions, namely $C_n = n^{.1}$, $C_n = n^{.2}$, $C_n = n^{.3}$, $C_n = n^{.4}$, $C_n = \log n$, $C_n = (\log n)^{.2}$, $C_n = (\log n)^{.4}$, $C_n = (\log n)^{.6}$, $C_n = (\log n)^{.8}$, $C_n = (n \log n)^{.1}$, $C_n = (n \log n)^{.3}$, $C_n = (n \log n)^{.5}$, $C_n = (n \log n)^{.7}$, $C_n = (n \log n)^{.9}$. All of them satisfy (9), and they diverge to infinity at different rates. Extensive numerical simulations indicate that $C_n = \log n$ is a good choice for an undamped exponential model, although no theoretical justification can be given in favor of this. Experimentally it is observed that the penalty function $\frac{1}{2}\log n$ is milder than $\log n$ and therefore it overestimates M. On the other hand, the penalty function $\log n$ appears appropriate, at least for Gaussian errors.
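The three criteria differ only in the multiplier applied to $d(\hat{\theta}_L) = 3L + 1$, which a short sketch can make concrete. It assumes that $\sigma^2$ is maximized out of the likelihood, so that $-\log f$ at the MLE equals $n \log(\pi\,\mathrm{RSS}_L / n) + n$, with $\mathrm{RSS}_L$ the minimized residual sum of squares for order L. The function names and the residual values below are hypothetical and serve only as an illustration.

```python
import numpy as np

def neg_loglik_from_rss(rss, n):
    """-log f at the MLE with sigma^2 profiled out (assumed form)."""
    return n * np.log(np.pi * rss / n) + n

def select_order(rss_by_order, n, penalty="edc"):
    """Return the L in 1..K that minimizes -log f + C_n * (3L + 1)."""
    # AIC uses multiplier 1 as in (6), MDL uses (1/2) log n as in (7),
    # and EDC is taken here with C_n = log n.
    c_n = {"aic": 1.0, "mdl": 0.5 * np.log(n), "edc": np.log(n)}[penalty]
    orders = np.arange(1, len(rss_by_order) + 1)
    rss = np.asarray(rss_by_order, dtype=float)
    crit = neg_loglik_from_rss(rss, n) + c_n * (3 * orders + 1)
    return int(orders[np.argmin(crit)]), crit

# Hypothetical residual sums of squares for L = 1, ..., 6 (illustrative only).
rss = [40.0, 2.5, 2.2, 2.0, 1.9, 1.8]
choices = {name: select_order(rss, n=25, penalty=name)[0] for name in ("aic", "mdl", "edc")}
```

With these made-up residuals the milder AIC penalty accepts one more component than MDL or EDC, which mirrors the tendency to overestimate described above.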


3.  ESTIMATION  OF  THE  UNKNOWN  PARAMETERS  IN  PRESENCE  OF 
MISSING  VALUE  FOR  FIXED  M 

In this section we discuss the estimation of the unknown parameters when one observation is missing and the model order is known a priori. Since all the existing eigen decomposition methods use the fact that the data are equispaced, it is not immediate how they can be used if one observation is missing. It is well known that the Modified Forward Backward Linear Prediction (MFBLP) method of Tufts and Kumaresan (1982) works very well with short data lengths and moderate signal to noise ratios. Its practical implementation is also quite simple. In this section we observe that the MFBLP can be further modified so that it can be used even if one observation is missing, provided the model order is known.

Note that for known M, there exists a vector $g = (g_1, \ldots, g_J)$ such that, in the noiseless data, the forward backward prediction equations are as follows (Tufts and Kumaresan; 1982)

$$
\begin{bmatrix}
y_J & \cdots & y_1 \\
\vdots & & \vdots \\
y_{n-1} & \cdots & y_{n-J} \\
\bar{y}_2 & \cdots & \bar{y}_{J+1} \\
\vdots & & \vdots \\
\bar{y}_{n-J+1} & \cdots & \bar{y}_n
\end{bmatrix}
\begin{bmatrix} g_1 \\ \vdots \\ g_J \end{bmatrix}
= -
\begin{bmatrix} y_{J+1} \\ \vdots \\ y_n \\ \bar{y}_1 \\ \vdots \\ \bar{y}_{n-J} \end{bmatrix}
\tag{10}
$$

Here $M < J < n - M$ and the bar denotes the complex conjugate of a complex number. Now if the $m$-th observation is missing, we can delete the corresponding rows from the left as well as from the right, wherever $y_m$ appears. For example, if $J + 1 < m < n - J - 1$, then (10) can be written as


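$$
\begin{bmatrix}
y_J & \cdots & y_1 \\
\vdots & & \vdots \\
y_{m-2} & \cdots & y_{m-J-1} \\
y_{m+J} & \cdots & y_{m+1} \\
\vdots & & \vdots \\
y_{n-1} & \cdots & y_{n-J} \\
\bar{y}_2 & \cdots & \bar{y}_{J+1} \\
\vdots & & \vdots \\
\bar{y}_{m-J} & \cdots & \bar{y}_{m-1} \\
\bar{y}_{m+2} & \cdots & \bar{y}_{m+J+1} \\
\vdots & & \vdots \\
\bar{y}_{n-J+1} & \cdots & \bar{y}_n
\end{bmatrix}
\begin{bmatrix} g_1 \\ \vdots \\ g_J \end{bmatrix}
= -
\begin{bmatrix} y_{J+1} \\ \vdots \\ y_{m-1} \\ y_{m+J+1} \\ \vdots \\ y_n \\ \bar{y}_1 \\ \vdots \\ \bar{y}_{m-J-1} \\ \bar{y}_{m+1} \\ \vdots \\ \bar{y}_{n-J} \end{bmatrix}
\tag{11}
$$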


For $m \le J + 1$ or $m \ge n - J - 1$ it can be defined similarly. The system of equations can be written as

$$ A_m g = -h_m, \tag{12} $$


where the matrix $A_m$ and the vector $h_m$ depend on which observation is missing. Now the minimum norm solution for g is given by

$$ g = -\left(A_m^H A_m\right)^{+} A_m^H h_m. \tag{13} $$

Here 'H' and '+' denote the complex conjugate transpose of a matrix and the pseudo inverse of a matrix respectively, as given in Rao (1973). Now if we use the usual linear prediction notation, $R = A_m^H A_m$ and $r = -A_m^H h_m$, then it can easily be shown, similarly to Tufts and Kumaresan (1982), that in the noiseless data

$$ g = \sum_{i=1}^{M} \frac{1}{\gamma_i} \left(u_i^H r\right) u_i, \tag{14} $$

where $\gamma_1 \ge \ldots \ge \gamma_M > \gamma_{M+1} = \ldots = \gamma_J = 0$ are the eigen values of R and $u_i$, $i = 1, \ldots, J$, are the corresponding orthonormal eigen vectors of R. Form the prediction error polynomial equation with the vector g as follows,

$$ H(z) = 1 + g_1 z^{-1} + \ldots + g_J z^{-J} = 0. \tag{15} $$

The equation (15) has J roots. It can be shown in the same way as Kumaresan (1982) that, in the case of noiseless data, out of the J roots of (15), M of them will be at $e^{i\omega_j}$ for $j = 1, \ldots, M$, and the other $J - M$ roots will have magnitudes strictly less than one and will be distributed uniformly around the inside of the unit circle.
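The steps from (10) to (15) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' code: the names `fb_system` and `mfblp_frequencies` are hypothetical, indices are 0-based (so `m = 11` is the 12th observation), and the prediction polynomial is taken in powers of $z^{-1}$ so that the signal roots lie at $e^{i\omega_j}$.

```python
import numpy as np

def fb_system(y, J, m=None):
    """Forward-backward equations A g = -h; rows touching y[m] are dropped."""
    n = len(y)
    A, h = [], []
    for t in range(J, n):                            # forward row predicts y[t]
        if m is None or not (t - J <= m <= t):
            A.append(y[t - J:t][::-1])               # y[t-1], ..., y[t-J]
            h.append(y[t])
    for t in range(n - J):                           # backward row predicts conj(y[t])
        if m is None or not (t <= m <= t + J):
            A.append(np.conj(y[t + 1:t + J + 1]))    # conj of y[t+1], ..., y[t+J]
            h.append(np.conj(y[t]))
    return np.array(A), np.array(h)

def mfblp_frequencies(A, h, M):
    """Rank-M minimum norm solution as in (14), then the roots of (15)."""
    R = A.conj().T @ A
    r = -A.conj().T @ h
    gamma, U = np.linalg.eigh(R)                     # eigenvalues in ascending order
    top = np.argsort(gamma)[::-1][:M]                # indices of the M largest
    g = sum((U[:, i].conj() @ r) / gamma[i] * U[:, i] for i in top)
    # Roots of z^J + g_1 z^(J-1) + ... + g_J, i.e. z^J H(z) = 0.
    roots = np.roots(np.concatenate(([1.0], g)))
    signal = roots[np.argsort(np.abs(np.abs(roots) - 1.0))[:M]]
    return np.sort(np.mod(np.angle(signal), 2 * np.pi))

# Noiseless check with two unit-amplitude components (illustrative values).
k = np.arange(1, 26)
omega_true = 2 * np.pi * np.array([0.50, 0.60])
y = np.exp(1j * np.outer(k, omega_true)) @ np.array([1.0, 1.0])
A_m, h_m = fb_system(y, J=8, m=11)
omega_hat = mfblp_frequencies(A_m, h_m, M=2)
```

In this noiseless example with n = 25 and J = 8, deleting the rows that involve the 12th observation leaves 16 of the original 34 equations, and the two roots nearest the unit circle still give back the frequencies.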

In the case of noisy data, first estimate g from (14), form the polynomial equation (15) and obtain the J roots of the prediction polynomial equation. Once we obtain the J roots, the estimates of the $\omega_j$'s can be obtained from those M roots whose magnitudes are closest to one. Note that once we estimate the non-linear parameters $\omega_j$, the linear parameters $A_j$ can be estimated very easily by the linear regression technique, as given below. The model (1) can be written as

$$
\begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix}
=
\begin{bmatrix}
e^{i\omega_1} & \cdots & e^{i\omega_M} \\
\vdots & & \vdots \\
e^{in\omega_1} & \cdots & e^{in\omega_M}
\end{bmatrix}
\begin{bmatrix} A_1 \\ \vdots \\ A_M \end{bmatrix}
+
\begin{bmatrix} e_1 \\ \vdots \\ e_n \end{bmatrix}.
\tag{16}
$$


Let us write (16) as

$$ Y = \Omega A + E. $$

If the matrix $\Omega$ is completely known, then the least squares estimator, $\hat{A}$, of A can be written as (see Kundu; 1993)

$$ \hat{A} = (\Omega^H \Omega)^{-1} \Omega^H Y. $$
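The amplitude step is an ordinary linear least squares fit, as the following sketch shows. The function name `ls_amplitudes` and the optional argument `k` (the time indices of the available observations) are assumptions made for illustration. The code calls `numpy.linalg.lstsq` rather than forming $(\Omega^H \Omega)^{-1}$ explicitly; the two agree when $\Omega$ has full column rank.

```python
import numpy as np

def ls_amplitudes(y, omegas, k=None):
    """Least squares amplitudes for given radian frequencies."""
    y = np.asarray(y, dtype=complex)
    k = np.arange(1, len(y) + 1) if k is None else np.asarray(k)
    Omega = np.exp(1j * np.outer(k, omegas))         # entries e^{i k omega_j}
    A_hat, *_ = np.linalg.lstsq(Omega, y, rcond=None)
    return A_hat

# Noiseless check with hypothetical amplitudes.
k = np.arange(1, 26)
omegas = 2 * np.pi * np.array([0.50, 0.60])
A_true = np.array([1.0 + 1.0j, 2.0])
y = np.exp(1j * np.outer(k, omegas)) @ A_true
A_hat = ls_amplitudes(y, omegas)

# The same call works when one observation (here the 12th) is left out.
keep = k != 12
A_hat_missing = ls_amplitudes(y[keep], omegas, k=k[keep])
```

Passing the retained time indices through `k` is one simple way to handle a missing observation, since the regression does not need equispaced data.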


4.  CROSS  VALIDATION  APPROACH: 


In this section we show how the missing value technique discussed in Section 3 can be used for model order selection in model (1). A practical and satisfactory method of model selection for small samples is the Cross Validation approach, originally proposed by Lachenbruch (1975) and Stone (1974, 1977a,b). Dawid (1974) suggested that in certain circumstances CV might lead to inconsistency, but considerable interest has been shown recently in cross validatory procedures because of their satisfactory performance in small samples. For the exponential signal model (1), Rao (1988) proposed the following CV technique:


(1) For any fixed L, leave out one of the observations, say $y_m$, and replace it by $Y_m(L)$. Then for any choice of L and $Y_m(L)$, compute

$$ R(Y_m(L), L) = \min_{A,\,\omega} \left\{ \sum_{k=1,\,k \ne m}^{n} \Bigl| y_k - \sum_{j=1}^{L} A_j e^{i\omega_j k} \Bigr|^2 + \Bigl| Y_m(L) - \sum_{j=1}^{L} A_j e^{i\omega_j m} \Bigr|^2 \right\}, \tag{17} $$

where $A = (A_1, \ldots, A_L)$ and $\omega = (\omega_1, \ldots, \omega_L)$.

(2) For any given L, using a 'suitable computer program', find $\hat{Y}_m(L)$ such that

$$ R(\hat{Y}_m(L), L) = \min_{Y_m(L)} R(Y_m(L), L), \tag{18} $$

which provides $\hat{Y}_m(L)$ as an estimate of $y_m$ for a given L. Then, comparing $\hat{Y}_m(L)$ with the observed $y_m$, the cross validatory error is obtained as

$$ R_*(L) = \sum_{m=1}^{n} \bigl| y_m - \hat{Y}_m(L) \bigr|^2 \quad \text{for } L = 1, \ldots, K. \tag{19} $$

Finally $\hat{M}$ is chosen such that it minimizes $R_*(L)$.

First we would like to make some comments on the practical implementation of the above CV algorithm and the numerical difficulties encountered. Observe that for a fixed $Y_m(L)$, (17) is a non-linear problem and it is not possible to obtain an explicit expression for $R(Y_m(L), L)$. Furthermore, the least squares minimization problem in (17) is well known for its numerical instability (see for example Kundu (1993, 1994), Bresler and Macovski (1986), Varah (1985)). The least squares estimators often depend on the initial values, and sometimes the iterative procedure may not even converge. Therefore the minimization of $R(Y_m(L), L)$ with respect to $Y_m(L)$ may not be as simple as was suggested. We suggest the following CV procedure:

[1] For fixed L, leave out one of the observations, say $y_m$.

[2] Estimate A and $\omega$ from $y_1, \ldots, y_{m-1}, y_{m+1}, \ldots, y_n$ using the missing value technique discussed in the previous section.

[3] Estimate $y_m$, say by $\hat{y}_m(L)$, where

$$ \hat{y}_m(L) = \sum_{j=1}^{L} \hat{A}_j e^{i\hat{\omega}_j m}, \tag{20} $$

and $\hat{A}_j$ and $\hat{\omega}_j$ are the estimates obtained from Step 2.

[4] Obtain the cross validatory error, as Rao (1988) suggested, by

$$ R^*(L) = \sum_{m=1}^{n} \bigl| y_m - \hat{y}_m(L) \bigr|^2 \quad \text{for } L = 1, \ldots, K. \tag{21} $$

[5] Finally $\hat{M}$ is chosen such that it minimizes $R^*(L)$.

Observe that our method is quite easy to implement in practice. Since the method discussed in Section 3 is non-iterative in nature, the estimators of A and $\omega$ can be obtained quite easily in Step 2. Note that Rao's $\hat{Y}_m(L)$ is the best least squares predictor of $y_m$ for a given L; he did not need to estimate the $A_j$'s and $\omega_j$'s separately in order to estimate $y_m$. Since obtaining Rao's $\hat{Y}_m(L)$ is numerically difficult, we approximate it by $\hat{y}_m(L)$, which is much easier to obtain. Intuitively it seems that

$$ E(R^*(L)) > E(R_*(L)) \quad \text{for } L = 1, \ldots, K. \tag{22} $$

Since (22) holds for all L, it should not make much difference to the estimation of M.
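Steps [1] to [5] can be put together as in the sketch below. It is a minimal illustration under assumptions rather than the implementation used for the experiments: all function names are hypothetical, indices are 0-based inside the code, the prediction polynomial is taken in powers of $z^{-1}$, and the test signal (unit amplitudes, noise variance 0.01, seed 1) is made up for the example.

```python
import numpy as np

def fb_system(y, J, m):
    """Prediction equations A g = -h with every row involving y[m] removed."""
    n = len(y)
    A, h = [], []
    for t in range(J, n):                            # forward rows
        if not (t - J <= m <= t):
            A.append(y[t - J:t][::-1]); h.append(y[t])
    for t in range(n - J):                           # backward rows
        if not (t <= m <= t + J):
            A.append(np.conj(y[t + 1:t + J + 1])); h.append(np.conj(y[t]))
    return np.array(A), np.array(h)

def estimate_frequencies(A, h, L):
    """Rank-L truncated solution, then the L roots closest to the unit circle."""
    R, r = A.conj().T @ A, -A.conj().T @ h
    gamma, U = np.linalg.eigh(R)
    top = np.argsort(gamma)[::-1][:L]
    g = sum((U[:, i].conj() @ r) / gamma[i] * U[:, i] for i in top)
    roots = np.roots(np.concatenate(([1.0], g)))
    return np.angle(roots[np.argsort(np.abs(np.abs(roots) - 1.0))[:L]])

def cv_error(y, L, J):
    """Leave-one-out prediction error for model order L."""
    y = np.asarray(y, dtype=complex)
    n = len(y)
    k = np.arange(1, n + 1)
    total = 0.0
    for m in range(n):                               # Step 1: leave out y[m]
        A, h = fb_system(y, J, m)
        omega = estimate_frequencies(A, h, L)        # Step 2: frequencies
        keep = np.arange(n) != m
        Omega = np.exp(1j * np.outer(k, omega))
        amp, *_ = np.linalg.lstsq(Omega[keep], y[keep], rcond=None)  # Step 2: amplitudes
        total += abs(y[m] - Omega[m] @ amp) ** 2     # Steps 3 and 4
    return total

def cv_order(y, K, J):
    """Step 5: the order with the smallest cross validatory error."""
    errors = [cv_error(y, L, J) for L in range(1, K + 1)]
    return int(np.argmin(errors)) + 1, errors

rng = np.random.default_rng(1)
k = np.arange(1, 26)
clean = np.exp(1j * np.outer(k, 2 * np.pi * np.array([0.50, 0.60]))) @ np.array([1.0, 1.0])
y = clean + np.sqrt(0.01 / 2) * (rng.standard_normal(25) + 1j * rng.standard_normal(25))
m_hat, errors = cv_order(y, K=6, J=8)
```

No starting values are needed anywhere in this loop, which is the practical advantage over minimizing (17) directly. The cost is one eigen decomposition and one root finding step per left-out observation and per candidate order.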




5.  NUMERICAL  EXPERIMENTS  AND  DISCUSSIONS: 

In this section we present some results of the numerical experiments. All the computations were performed on an HP-9000 using the IMSL random number generator. We consider the following three models:

Model 1: $y_k = \exp(\pi/4)\exp(i2\pi(.50)k) + \exp(\pi/4)\exp(i2\pi(.52)k) + e_k$,

Model 2: $y_k = \exp(\pi/4)\exp(i2\pi(.50)k) + \exp(\pi/4)\exp(i2\pi(.60)k) + e_k$,

Model 3: $y_k = \exp(\pi/4)\exp(i2\pi(.50)k) + \exp(\pi/4)\exp(i2\pi(.60)k) + \exp(\pi/2)\exp(i2\pi(.20)k) + \exp(\pi/2)\exp(i2\pi(.25)k) + e_k$.

Note that for Model 1 and Model 2 the amplitudes are taken to be equal. The difference between the radian frequencies is larger in Model 2 than in Model 1. It is well known that the larger the difference, the more accurate the estimators. Model 2 and Model 3 have two components exactly in common. Since Model 3 has more parameters than Model 2, it is expected that the estimators will be less accurate for Model 3 than for Model 2. No such comparison can be made between Model 1 and Model 3. The real and imaginary parts of the $e_k$'s are independent and normally distributed with zero mean and finite variance $\sigma^2/2$. The sample sizes are taken as n = 25, 40 and 55, and SNR = 5 dB, 10 dB and 15 dB. It is assumed that the maximum number of signals is 6, i.e. K = 6. For each data set we estimate M by AIC, MDL and EDC. For EDC we take $C_n = \log n$, as suggested in Kundu (1992). We also estimate M by the proposed CV method. We use different orders J (within the admissible range) of the prediction equations in (10) to estimate the model parameters in the presence of a missing value, and in turn to estimate M. If the prediction order is J ($\ge 6$), the corresponding CV method is denoted by CV(J). We take different values of J for different sample sizes. For each sample size, each SNR and each model, we replicate the process five hundred times. The results are presented in Tables 1-3. We present the percentage of correct estimates (PCE), the percentage of over estimates (POE) and the percentage of under estimates (PUE). The entries in each table are the PUE, the PCE and the POE for the different methods over the five hundred replications.
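The three summary percentages are simple to compute from the replicated order estimates, as the following sketch shows. The function name `detection_percentages` and the ten estimates in the example are hypothetical; the experiments described above use five hundred replications.

```python
import numpy as np

def detection_percentages(estimates, true_order):
    """Return (PUE, PCE, POE) in percent for a list of order estimates."""
    est = np.asarray(estimates)
    pue = 100.0 * np.count_nonzero(est < true_order) / est.size   # under estimates
    pce = 100.0 * np.count_nonzero(est == true_order) / est.size  # correct estimates
    poe = 100.0 * np.count_nonzero(est > true_order) / est.size   # over estimates
    return pue, pce, poe

# Hypothetical order estimates from ten replications with true M = 2.
pue, pce, poe = detection_percentages([2, 2, 3, 2, 1, 2, 2, 4, 2, 2], true_order=2)
```

The three values add up to 100 before rounding, so an entry reported to the nearest integer can be checked against the other two in its row.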

Comparing the tables, it is observed that for fixed n the performance of almost all the methods improves as the SNR increases, and for fixed SNR the performance also improves as n increases, except for AIC. Since the difference between the two frequencies is larger in Model 2 than in Model 1, the estimators of the number of signals are more accurate in Model 2 than in Model 1. Between Model 2 and Model 3, since the number of parameters is larger in Model 3, the estimators are less accurate for Model 3. Among the different methods, the performance of AIC is quite poor. The percentage of correct estimation does not exceed 60% for AIC even at high SNR, and for fixed SNR the PCE of AIC decreases as the sample size increases. The inconsistency of the AIC is quite prominent in this situation. MDL works quite satisfactorily at high SNR, although its percentage of correct estimation drops to between 80% and 85% if the SNR is low. The performance of EDC with $C_n = \log n$ is quite satisfactory. The consistency of MDL and EDC can be verified from the experiment. The performance of the CV method depends on the order of the prediction equations. As the prediction order J increases, the percentage of correct estimates increases up to a certain point and then starts decreasing. It is observed in our experiment that the percentage of correct detection of CV(J) is maximum if $J \approx n/3$, although we cannot give any theoretical justification for this. It seems more work is needed in this direction. It is also observed that if $J \approx n/3$, then CV(J) works much better than AIC or MDL, its performance is quite comparable with that of the best performing EDC, and in certain cases CV(J) works marginally better than the best performing EDC.

It is well known that the usual cross validation method does not, in general, give consistent estimates of the model order, although in some situations it might. In this case it seems that if we take $J \approx n/3$, it might provide consistent estimates of M.

Let us consider the computational complexities of the different methods. From the computational point of view, AIC, MDL and EDC are much faster than the CV method if K is small. If the maximum model dimension K is large, then AIC, MDL and EDC become more complicated, because they need to solve non-linear equations in 2K dimensions. The solutions may depend on the initial values and they may lead to a local minimum rather than a global minimum. On the other hand, if the sample size is large then the CV computations become more time consuming, although the implementation is quite simple. The proposed CV computation does not require any initial values.


6.  CONCLUSIONS: 

In this paper we consider the estimation of the number of signals in undamped exponential models, which is a very important problem in Statistical Signal Processing. We propose a new CV approach based on the missing value technique. It is observed that the proposed CV method works quite well and performs better than the usual AIC and MDL. Its performance is quite comparable to that of the best performing EDC. Another point we would like to make is that, although we assume that the errors are complex Gaussian random variables, this assumption is not used except implicitly at (21). Therefore, it seems the CV procedure should work even if the errors are not complex Gaussian but come from any light tailed distribution, whereas for AIC, MDL and EDC the exact distributional assumptions on the error random variables are very important for their implementation. Taking all these points into account, we suggest that the CV method can be used to estimate the number of signals for the model (1) in many situations.
Table 1
Model 1
Sample Size 25

            SNR = 15        SNR = 10        SNR = 5
Methods   PUE PCE POE     PUE PCE POE     PUE PCE POE
CV(6)       0  75  25       0  51  49       0  16  84
CV(8)       0  96   4       0  90   ?       3  85  12
CV(10)      0  94   6      13  78   9       0  53  47
AIC         0  55  45       0  56  44       0  56  44
MDL         0  87  13       0  87  13       0  80  20
EDC         ?  94   6       0  90   ?       0  80  20

Sample Size 40

            SNR = 15        SNR = 10        SNR = 5
Methods   PUE PCE POE     PUE PCE POE     PUE PCE POE
CV(6)       ?  77  23       0  71  29       0  56  44
CV(8)       ?  93   7       0   ?  10       0  88  12
CV(?)       ?  94   6       0  94   6       0  92   8
CV(13)      ?  97   3       0  95   5       0  94   6
CV(15)      ?  95   5       0  93   7       0  91   9
AIC         ?  48  52       0  46  54       0  43  57
MDL         ?  91   9       0  88  12       0  89  11
EDC         ?  97   3       0  93   7       0  94   6

Sample  Size  55 

SNR  =15  SNR  =10  SNR  —  5 


Methods 

PUE 

PCE 

POE 

[PUE 

[■anaB 

PCE 

POE 

CV(6) 

85 

15 

83 

17 

CV(8) 

90 

■9 

88 

12 

CV(12) 

96 

4  • 

95 

5 

CV(18) 

98 

2 

97 

3 

97 

3 

96 

4 

AIC 

42 

58 

38 

62 

MDL 

87 

13 

15 

85 

15 

EDC 

97 

3 

3 

93 

7 


Table  2 
Model  2 
Sample  Size  25 


SNR  =  15  SNR  =10  SNR.  =  5 


Methods 

PUE 

PCE 

POE 

PUE 

PCE 

POE 

CV(6) 

0 

57 

43 

55 

CV(8) 

0 

100 

0 

m 

■tMiw 

0 

3 

0 

95 

93 

n 

AIC 

0 

44 

mm 

53 

1  I 

52 

MDL 

0 

10 

n 

88 

o 

86 

o 

EDC 

0 

99 

1 

Bfl 

99 

H 

99 

■9 

Sample  Size  40 

MR=J£  ,, _  SNR  =  10  SNR  =  5 


Methods 

PUE 

PCE 

POE 

PUE 

PCE 

POE 

PUE 

PCE 

POE 

CV(6) 

0 

70 

30 

MM 

65 

35 

CV(8) 

0 

91 

9 

n 

90 

Kb 

im 

0 

93 

7 

D 

93 

7 

CV(13) 

0 

98 

2 

H 

97 

3 

CV(15) 

0 

95 

5 

Hi 

94 

6 

AIC 

0 

46 

54 

56 

43 

57 

MDL 

0 

87 

13 

13 

86 

14 

EDC 

0 

98 

2 

0 

98 

2 

97 

3 

Sample Size 55

            SNR = 15        SNR = 10        SNR = 5
Methods   PUE PCE POE     PUE PCE POE     PUE PCE POE
CV(6)       0  86  14       0  84  16       0   ?   ?
CV(8)       0  94   6       0  91   9       0   ?   ?
CV(12)      0  95   5       0  93   7       0   ?   5
CV(18)      0 100   0       0 100   0       0   ?   0
CV(?)       0  98   2       0  97   3       0  96   4
AIC         0  37  63       0  37  63       0  35  65
MDL         0  88  12       0  88  12       0  86  14
EDC         0 100   0       0  99   0       0  99   0



Methods 

PUE 

PCE  1 

POE 

Table  3 
Model  3 
Sample  Size  25 
SNR  =  15  SNR  =  10 


PUE 

PCE 

0 

52 

0 

93 

0 

88 

0 

54 

0 

85 

0 

89 

SNR  =  5 


PUE 

PCE 

POE 

0 

0 

1 

78 

12 

0 

63 

37 

0 

54 

46 

o 

80 

20 

o 

82 

18 

Sample  Size  40 


Methods 

PUE 

PCE 

POE 

PUE 

PCE 

POE 

PUE 

PCE 

POE 

CV(6) 